Abstract: A new model of computation, concurrent tree rewriting,
is proposed as a bridge between easily programmed Ultra High Level
Languages (UHLLs) featuring implicit concurrency, and an advanced
parallel architecture of unprecedented performance, the Rewrite Rule
Machine (RRM) architecture. At the highest level of abstraction,
computation is understood as rewriting a tree at multiple sites concurrently.
Less abstractly, such a (possibly very large) tree can be partitioned into
fragments that are assigned to different processors, with each processor
doing concurrent rewriting on its own fragment of the tree; this
gives the second level, partitioned concurrent rewriting. After introducing
the basic concepts and properties of the model, we discuss tradeoffs
between tree and directed acyclic graph (dag) data representations; we
also study partitioned concurrent rewriting, including tree and rule
partitioning, and discuss evaluation strategies as a flexible control mechanism
for concurrent rewriting. The mathematical definitions are gathered in
an appendix.


1  Introduction 


This report documents recent progress on models of computation for the Rewrite
Rule Machine (RRM) project at SRI International. The next paragraphs of the
introduction discuss the overall goals of the RRM project, software issues that our
model solves, and the model of computation. Section 2 studies the most abstract
level of concurrent rewriting, and Section 4 treats partitioned tree rewriting.
Tradeoffs between the tree and directed acyclic graph representations are discussed
in Section 3, and rewriting strategies in Section 5. For an overview of the RRM
project, the reader is referred to [5]. Results on simulation are reported in [13].
The architectural design, including designs for tree representation and rewriting
inside each processor, is documented in [11].

1.1 Goals of the RRM Project

The purpose of the RRM project is to design and test a prototype general purpose,
high-speed, large-scale, parallel computer architecture that is especially suitable
for symbolic computation with ultra-high-level languages (UHLLs). One promising
application area is the support of powerful program development environments,
including "intelligent" editors, compilers, libraries of reusable software, rapid
prototyping tools, formal specification languages, program verifiers, debuggers,
test case generators, etc. Artificial Intelligence applications are also promising,
including robot planning, natural language processing, high level vision, and
expert systems. A third promising area is hardware simulation.

The RRM will consist of a large number of processors, each with custom VLSI to
process tree-structured data independently and very efficiently. The UHLLs
considered will combine object-oriented, functional, and logic programming, plus other
powerful features, including parameterized programming, graphical programming,
sophisticated error handling, and powerful type systems. Compilers will convert
UHLL programs into sets of rewrite rules for execution on the RRM.

1.2  Software  Issues 

From  the  software  point  of  view,  tree  processing  means  that  manipulations  can 
easily  be  described  in  a  way  that  is  independent  of  the  order  of  execution,  and 
that also provides ample opportunities for concurrent execution. The basic mode
of tree processing is called tree (or term¹) rewriting (or replacement or reduction),
and  refers  to  the  replacement  of  one  subtree  by  another,  whenever  a  tree-structured 
template  is  matched.  A  rewrite  rule  consists  of  two  such  templates,  one  for  the 
subtree  to  be  replaced,  and  another  that  determines  what  it  is  to  be  replaced  by. 

1.2.1  Programmability 

We feel that programmability is one of the most critical issues blocking further
progress in parallel computation: it does little good (except for very homogeneous
problems) to provide lots of processors if the programmer has to explicitly assign
processes to processors. We feel that the best approach to combining hardware
efficiency with programming ease and flexibility is to have a model of computation
that provides a simple bridge between a powerful UHLL and the hardware itself.
We argue that tree rewriting is such a bridge. As shown in this report, concurrency
is implicit in the rewrite rules themselves and can be directly exploited by the model
and the architecture without any explicit concurrency constructs in the programming
language. Work on programming language semantics shows how to implement
advanced languages with tree rewriting. We have taken OBJ2 [4] as our basis. This
is a very advanced functional UHLL based on tree rewriting with a uniquely powerful
generic module facility and type system. This basis has been extended to include
logic programming [6] and object-oriented programming [7].

¹ In this presentation we will often use the word "tree" as a synonym for "term," except when
discussing data representations for terms, where we will distinguish between tree and dag
representations.

1.2.2 Tree-Structured Data and Computation

There are many applications in which the data are naturally tree-structured (in the
sense that there is a natural hierarchy, with a "root" node at the top and with one or
more branches from each node that is not a tip node), and for which tree rewriting
is a highly efficient and natural mode of execution. Some examples of naturally
occurring tree structures are:

•  Menus,  such  as  occur  in  interactive  graphics, 

•  Expressions,  such  as  (A  +  B)(A*  +  3)  and,  more  generally,  programs, 

•  Natural  language  syntax,  and  many  other  structures  that  occur  in  natural  or 
artificial  languages,  such  as  plans  and  explanations, 

•  Most  generally,  any  abstract  data  type. 

In fact, tree structure is a fully general form of data structuring, since any
computable function can be seen as a computation on tree-structured data that rewrites
subtrees into other subtrees. In particular, such processes as selecting a particular
item from a menu, evaluating an arithmetic expression, verifying a program, assigning
a meaning to a sentence, editing a program, constructing a plan, and compiling
or interpreting a program can all be conveniently described as tree rewriting
processes. The feasibility of writing nontrivial programs with rewrite rules has been
shown by experience with programming languages like OBJ, Hope, Miranda, and
FP, and by the work of Hoffman and O'Donnell. For example, several different
language interpreters have been written in OBJ, including one for OBJ.

1.3 Models of Computation

Previous  tree  rewriting  studies  have  addressed  the  sequential  case.  In  this  report  we 
develop  the  basic  concepts  for  concurrent  tree  rewriting,  i.e.,  the  rewriting  of  a  tree 
at  multiple  sites  concurrently.  This  amounts  to  a  new  model  of  computation  with 
some  properties  analogous  to  sequential  tree  rewriting  and  with  new  properties  of 
its  own.  An  issue  of  crucial  importance  for  concurrent  tree  rewriting  is  whether  to 
represent  data  as  trees  or  as  directed  acyclic  graphs  (dags).  Important  efficiency 
and  communication  tradeoffs,  and  even  more  general  ways  of  performing  concurrent 
rewriting  appear,  when  considering  the  choice  of  trees  vs.  dags.  Since  the  RRM 
architecture  provides  a  network  of  processors  such  that  each  can  do  parallel  tree 
rewriting  and  several  processors  can  cooperate  in  rewriting  a  large  tree,  a  natural, 
more  concrete,  refinement  of  the  model  is  to  consider  partitioned  concurrent  term 
rewriting.  Important  issues  for  partitioned  rewriting  include:  (i)  finding  appropriate 
criteria  for  tree  partitioning;  (ii)  efficiency  issues  related  to  the  partitioning  of  the 
rules,  since  each  processor  does  not  need,  and  cannot  store,  the  entire  set  of  rewrite 
rules  that  make  up  a  large  program,  and  not  all  the  rules  are  equally  expensive; 
(iii)  communication  issues  when  rewriting  takes  place  at  the  border  separating  two 
or  more  fragments  of  the  tree.  An  additional  problem  that  we  address  is  rewriting 
strategies.  Strategies  can  be  used  as  a  flexible  control  mechanism  that,  while  still 
allowing  concurrency,  avoids  useless  computations  that  could  take  up  considerable 
resources. 


2  Concurrent  Tree  Rewriting 

This section provides an informal introduction to concurrent tree rewriting.
Although the style is informal and intuitive, the ideas introduced here can be made
mathematically precise. Indeed, formal definitions, although not essential to
understand the main ideas, are nevertheless important in order to develop the theory
with mathematical rigor. Mathematical definitions for all the concepts introduced
in this section can be found in the Appendix.

In  the  concurrent  tree  rewriting  model  of  computation,  data  are  trees  with  nodes 
labelled  by  function  symbols  and  leaves  labelled  by  constants,  and  programs  are 
sets  of  equations  that  are  interpreted  as  left  to  right  rewrite  rules.  The  left-  and 
righthand  sides  of  an  equation  are  trees  with  nodes  labelled  by  function  symbols 
and  leaves  labelled  by  constants  or  variables.  A  variable  can  be  instantiated  by  any 
tree  of  the  type  given  to  the  variable;  such  an  instantiation  of  a  set  of  variables  is 
called  a  substitution  or  match. 
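
As a rough illustration of the notions of substitution and match, the following Python sketch represents a variable as a string, a constant as an integer, and a function application as a tuple whose first component is the function symbol. This representation and the names match and instantiate are assumptions made for the illustration only, and the typing of variables is ignored.

```python
# Illustrative term representation (an assumption of this sketch):
#   variable    -> str, e.g. "X"
#   constant    -> int, e.g. 3
#   application -> tuple (function_symbol, arg1, ..., argn)

def match(pattern, term, subst=None):
    """Return a substitution (dict) that makes pattern equal to term, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                 # variable
        if pattern in subst:
            # A repeated variable must be bound to equal subterms.
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if isinstance(pattern, int):                 # constant
        return subst if pattern == term else None
    # Application: same symbol and arity, then match the arguments in order.
    if (not isinstance(term, tuple) or len(term) != len(pattern)
            or term[0] != pattern[0]):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

def instantiate(pattern, subst):
    """Replace every variable of pattern by the tree bound to it in subst."""
    if isinstance(pattern, str):
        return subst[pattern]
    if isinstance(pattern, int):
        return pattern
    return (pattern[0],) + tuple(instantiate(a, subst) for a in pattern[1:])

# Matching f(X, g(Y)) against f(3, g(h(4))) binds X to 3 and Y to h(4).
subst = match(("f", "X", ("g", "Y")), ("f", 3, ("g", ("h", 4))))
```

Keeping substitutions as plain dictionaries makes an instance of a righthand side a simple recursive copy, which corresponds to the tree (rather than dag) view of terms.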

Computation is tree rewriting (or reduction) by matching the lefthand side of a
rewrite rule to a subtree of the tree to be evaluated and then replacing that subtree
by a corresponding instance of the righthand side of the rule. For example, by
instantiating the variable N to the constant 6, the lefthand side of the rewrite rule

(*) fibo(N) -> fibo(N - 1) + fibo(N - 2)

(where fibo is the function symbol for Fibonacci numbers) can be matched to
fibo(6) in the term (fibo(6) + fibo(5)) + 0, which can also be represented by
the tree

[Figure: tree representation of the term (fibo(6) + fibo(5)) + 0]

The subterm fibo(6) where the match has occurred is called a redex. The redex
subterm can then be replaced by the corresponding instance of the righthand side,
so that the original term is rewritten to the term ((fibo(6-1) + fibo(6-2)) +
fibo(5)) + 0. Rewriting by applying one rule at only one location at a time is
called sequential tree rewriting. If the rewrite rule (*) had been applied to fibo(5)
instead, one step of sequential rewriting would have yielded (fibo(6) + (fibo(5-1)
+ fibo(5-2))) + 0.

Notice that the rule (*) could instead have been applied concurrently to both
fibo(6) and fibo(5), yielding

((fibo(6-1) + fibo(6-2)) + (fibo(5-1) + fibo(5-2))) + 0.

We call this concurrent tree rewriting. In concurrent tree rewriting several rules can
simultaneously be applied and several matches for each one of those rules can be
rewritten in a single step of concurrent computation. For example, by applying the
rule (*) concurrently in a first step, and then applying both the rule N + 0 -> N
and a rule for subtraction in a second concurrent step, we can transform the original
tree into the tree

[Figure: tree representation of the term (fibo(5) + fibo(4)) + (fibo(4) + fibo(3))]

in two steps of concurrent rewriting. This process continues until there are no
more matches, at which point the expression is said to be reduced or in normal form.
This simple example shows that tree rewriting is by its very nature concurrent. We
emphasise that the concurrency is implicit in the rewrite rules themselves, and no
explicit concurrency constructs are required in the language. We see this as a major
advantage.
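
The two concurrent steps of the fibo example can be sketched in Python as follows. This is only an illustration under stated assumptions: terms are nested tuples, variables are strings, the function names are hypothetical, the rules to be tried in each step are passed explicitly, and the "rule for subtraction" is approximated by four ground rules. In one step the sketch rewrites a redex at a node together with any redexes found inside the subterms bound to its variables, since those lie outside the matched lefthand side and therefore do not overlap it.

```python
def match(pat, term, subst):
    """Return an extended substitution if pat matches term, else None."""
    if isinstance(pat, str):                      # variable
        if pat in subst and subst[pat] != term:
            return None
        return {**subst, pat: term}
    if isinstance(pat, int):                      # constant
        return subst if pat == term else None
    if not isinstance(term, tuple) or len(pat) != len(term) or pat[0] != term[0]:
        return None
    for p, t in zip(pat[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(pat, subst):
    if isinstance(pat, str):
        return subst[pat]
    if isinstance(pat, int):
        return pat
    return (pat[0],) + tuple(instantiate(a, subst) for a in pat[1:])

def concurrent_step(term, rules):
    """Rewrite a set of nonoverlapping redexes of term in one step."""
    for lhs, rhs in rules:
        subst = match(lhs, term, {})
        if subst is not None:
            # Redex at this node; the subterms bound to variables can be
            # rewritten in the same step without clashing with it.
            subst = {v: concurrent_step(t, rules) for v, t in subst.items()}
            return instantiate(rhs, subst)
    if isinstance(term, tuple):
        return (term[0],) + tuple(concurrent_step(a, rules) for a in term[1:])
    return term

FIBO = (("fibo", "N"), ("+", ("fibo", ("-", "N", 1)), ("fibo", ("-", "N", 2))))
ZERO = (("+", "N", 0), "N")
# Stand-in for "a rule for subtraction": ground rules for the cases needed here.
SUB = [(("-", a, b), a - b) for a in (5, 6) for b in (1, 2)]

t0 = ("+", ("+", ("fibo", 6), ("fibo", 5)), 0)
t1 = concurrent_step(t0, [FIBO])          # rule (*) at both fibo redexes
t2 = concurrent_step(t1, [ZERO] + SUB)    # N + 0 -> N and subtraction together
```

Passing the rule set per step mirrors the example in the text; without a control mechanism of this kind, rule (*) alone would keep matching its own righthand side.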

A set of rules is called terminating if all possible ways of concurrently rewriting a
term do eventually stop in a normal form (the normal form reached in each case
may in general be different). Some equations are nonterminating and should not be
used as rewrite rules; for instance, a commutativity law X + Y = Y + X would lead
to the infinite chain of rewritings


fibo(6) + fibo(5) -> fibo(5) + fibo(6) -> fibo(6) + fibo(5) -> ...


For  functional  computations,  one  expects  the  final  result  not  to  depend  on  the 
particular  chain  of  rewritings  that  led  to  it,  i.e.,  one  expects  all  normal  forms  to  be 
equal.  This  property  is  guaranteed  to  hold  if  the  rules  satisfy  the  Church-Rosser 
property  illustrated  by  the  diagram  below 


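[Diagram: Church-Rosser property; two solid starred arrows diverge from a common
term, and two dotted starred arrows converge to a common term.]
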
where the starred arrows denote rewriting sequences of 0, 1, or more steps of (possibly
concurrent) rewriting. The Church-Rosser property says that any two rewriting
sequences starting at the same term (solid arrows) can always be reconciled by further
rewriting (dotted arrows). For nondeterministic computations, however, it is
sensible to allow reaching different final results, and the Church-Rosser property
should not be expected to hold.

If a lefthand (resp. righthand) side of a rule has only one instance of each of its
variables, the rule is called left- (resp. right-) linear. The rule (*) above is an
example of a left-linear rule. Left-linear rules are easier to match than
non-left-linear ones, since there is no need to check that different occurrences of a
variable are instantiated to the same subterm for left-linear rules.
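
Linearity is a purely syntactic property and is easy to check. The sketch below is a minimal illustration, assuming terms are nested tuples with variables written as strings; the function names are hypothetical.

```python
from collections import Counter

def variables(term):
    """Yield every variable occurrence (string leaf) of a term."""
    if isinstance(term, str):
        yield term
    elif isinstance(term, tuple):
        for arg in term[1:]:          # term[0] is the function symbol
            yield from variables(arg)

def is_linear(term):
    """True if no variable occurs more than once in term."""
    return all(n == 1 for n in Counter(variables(term)).values())

# Rule (*): fibo(N) -> fibo(N - 1) + fibo(N - 2)
lhs = ("fibo", "N")
rhs = ("+", ("fibo", ("-", "N", 1)), ("fibo", ("-", "N", 2)))
left_linear = is_linear(lhs)     # N occurs once on the left
right_linear = is_linear(rhs)    # N occurs twice on the right
```

Applied to rule (*), the check reports a rule that is left-linear but not right-linear.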

When  doing  concurrent  rewriting,  care  has  to  be  taken  of  the  case  when  two  lefthand 
sides  match  in  two  redexes  so  that  the  two  lefthand  sides  partially  overlap  each  other. 


Figure 1: Overlapping and nonoverlapping sets of redexes


Rules for which this happens are called overlapping, or self-overlapping if overlapping
occurs with the same rule. For instance, the associativity rule

(x + y) + z -> x + (y + z)

overlaps with itself, so that it has the entire expression ((5 + 7) + 9) + 7 and the
subterm (5 + 7) + 9 as redexes. This poses a problem for the well-definedness of
concurrent rewriting because there is a "clash" of redexes due to overlapping; this is
illustrated in Figure 1. However, later in this report we shall see that such clashes
can be tolerated if a dag representation is chosen for terms.
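
The clash in the associativity example can be detected mechanically. The following Python sketch is an illustration only, assuming a left-linear rule, terms as nested tuples, variables as strings, and occurrences written as tuples of 1-based argument indices; all function names are hypothetical. Two redexes of the same rule overlap when one lies at a non-variable position of the lefthand side matched at the other.

```python
def matches(pat, term):
    """Structural match for a left-linear pattern."""
    if isinstance(pat, str):                      # variable matches anything
        return True
    if not isinstance(pat, tuple):                # constant
        return pat == term
    return (isinstance(term, tuple) and len(term) == len(pat)
            and term[0] == pat[0]
            and all(matches(p, t) for p, t in zip(pat[1:], term[1:])))

def redexes(lhs, term, pos=()):
    """Yield the occurrences of term at which lhs matches."""
    if matches(lhs, term):
        yield pos
    if isinstance(term, tuple):
        for i, arg in enumerate(term[1:], start=1):
            yield from redexes(lhs, arg, pos + (i,))

def nonvar_positions(pat, pos=()):
    """Yield the occurrences of pat that are not variables."""
    if isinstance(pat, str):
        return
    yield pos
    if isinstance(pat, tuple):
        for i, arg in enumerate(pat[1:], start=1):
            yield from nonvar_positions(arg, pos + (i,))

def overlap(lhs, p, q):
    """True if the redexes of lhs at occurrences p and q overlap."""
    if len(p) > len(q):
        p, q = q, p
    if q[:len(p)] != p:                           # disjoint subtrees
        return False
    return q[len(p):] in set(nonvar_positions(lhs))

ASSOC = ("+", ("+", "x", "y"), "z")               # lefthand side of (x + y) + z
term = ("+", ("+", ("+", 5, 7), 9), 7)            # ((5 + 7) + 9) + 7
found = list(redexes(ASSOC, term))                # the root and occurrence 1
clash = overlap(ASSOC, found[0], found[1])
```

A scheduler for concurrent rewriting on trees could use such a test to pick a set of pairwise nonoverlapping redexes before each step.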

3  Trees  versus  Dags 

An important question is how to represent trees at the hardware level. A key
choice is whether or not to use dags (i.e., directed acyclic graphs), which permit
sharing of identical subtrees. We have considered the advantages and disadvantages
of dag versus strict tree structure for abstract concurrent tree rewriting and
for partitioned tree rewriting².

The problem of choosing between a tree representation and a dag representation for
terms is not a new one. Several studies on dag rewriting already exist, including
work such as [1] and [3]. But, to the best of our knowledge, none of the previous
studies deals with concurrent rewriting or has architectural concerns in mind.

3.1 Trees and Dags

We have already discussed and given examples of tree representations. The dag
representation generalises this. Notice that a tree is a particular kind of directed
acyclic graph, or dag, having a unique entering node (i.e. a unique node which
is not the target of any edge) and such that each other node is the target of
exactly one edge. Each dag with nodes labelled by function symbols and having

² The tree vs. dag question is very important at different levels of modelling, with different tradeoffs
appearing at each level. Simulation results will be used to decide at what levels trees or dags
should be chosen.
an entering node from which all other nodes are reachable can be transformed into
a (labelled) tree by successively splitting the nodes that are the targets of several
arrows. We then view all the dags that can be transformed into the same tree as
different representations of the same term. For instance, the term

f(f(f(a, a), f(a, a)), f(f(a, a), f(a, a)))

has, among others, the following two representations:


3.2 Comparison of the Two Representations

3.2.1 Space Occupation

Obviously the dag representation is more space efficient than the tree representation,
which has more nodes. There is always a dag representation of maximum space
efficiency, i.e., having a minimum number of nodes, called a fully shared dag
representation. The dag on the right in the example in section 3.1 is fully shared.
For that example, the space (number of nodes) occupied by the tree was 2^4 - 1 = 15 as
opposed to 4 for the dag representation. Thus, the space of the tree representation
could be in an exponential relation to the space of the dag representation. In general,
however, a term will not be sufficiently regular to allow an exponential gain in space
by full sharing. Statistics on the space efficiency of the dag representation for a
collection of examples are given in [13].
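
The node counts quoted for the example of section 3.1 can be reproduced with a short Python sketch. It assumes terms are nested tuples (a constant being a tuple with no arguments) and uses the fact that a fully shared dag has exactly one node per distinct subterm; the leaf symbol a and the function names are choices made for this illustration.

```python
def tree_size(term):
    """Number of nodes in the tree representation of term."""
    return 1 + sum(tree_size(arg) for arg in term[1:])

def shared_dag_size(term):
    """Number of nodes in the fully shared dag: one per distinct subterm."""
    seen = set()

    def visit(t):
        if t in seen:
            return
        seen.add(t)
        for arg in t[1:]:
            visit(arg)

    visit(term)
    return len(seen)

a = ("a",)                       # constant leaf
level1 = ("f", a, a)
level2 = ("f", level1, level1)
term = ("f", level2, level2)     # f(f(f(a,a),f(a,a)), f(f(a,a),f(a,a)))

sizes = (tree_size(term), shared_dag_size(term))
```

Each additional level of this regular term doubles the tree size while adding a single node to the fully shared dag, which is the exponential relation mentioned above.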

3.2.2 Criteria Related to Matching

In  order  to  detect  a  match,  one  may  have  to  do  either  of  the  following: 

test for equality of subterms. For example, to match the non-left-linear rule
x + x -> x, equality of the two subterms matching the variable x needs to be checked
for the tree representation. In a fully shared dag representation checking for
such an equality is trivial. Another example is given later (see Figure 3).

handle simultaneous read accesses to the same node. For example, matching
the rule f(h(x), h(g(y))) -> q(x, y) to the dag

[Figure: a dag containing a shared node labelled h]

will involve a simultaneous read access to the node labelled h. Such a concurrent
read must be supported by the hardware in all cases for dag representations,
even for nonoverlapping rules (note that the rule in the example is not
self-overlapping). For the tree representation, simultaneous read access to a
node can happen only when matching overlapping rules.


The first point is clearly an advantage of the dag representation and especially of
the fully shared dag representation, where the test of equality is equivalent to the
test of identity. But since rewriting of fully shared dags cannot be implemented
efficiently, the gain will be only partial. Nevertheless, a successful test for equality
can, in the case of a dag representation, be used to increase the amount of sharing,
thus freeing storage resources.

The second point shows a partial advantage of the tree representation since, even
if this kind of simultaneous read access may appear for trees in the case of
self-overlapping rules, it will be much less frequent than for dags.

3.2.3  Criteria  Related  to  Tree  Replacement 

We  now  consider  the  problem  from  the  point  of  view  of  tree  replacement.  There  are 
again  several  aspects: 

number  of  copies  needed.  Many  rules,  such  as  the  distributive  rule 

x * (y + z) -> (x * y) + (x * z)

are  not  right-linear.  For  dags,  duplication  of  a  variable  by  rewriting  involves 
only  modification  of  a  pointer,  but  for  trees  one  has  to  copy  the  duplicated 
subtrees,  which  can  be  very  large. 

direct  modification  of  node  labels.  Such  a  modification  is  correct  in  the  case 
of  trees  but  not  in  the  case  of  dags  (except  for  the  node  corresponding  to  the 
top  of  the  redex).  For  instance,  rewriting  the  tree  representation 

[Figure: the tree to be rewritten]

by means of the rewrite rule

[Figure: the rewrite rule]

can be accomplished just by modifying the node at occurrence 2.1, to obtain

[Figure: the resulting tree]

whereas  for  the  dag  representation 


a  local  copy  is  required.  The  result  is: 


Figure  2:  Concurrent  dag  rewriting  with  the  associativity  rule. 


overlapping of rules. By implementing a graph rewriting which modifies only the
node at the top of the redex, one can rewrite concurrently without worrying
about self-overlapping rules. A natural example using the associativity rule
((x + y) + z) -> (x + (y + z)) is described in Figure 2. Notice that this kind of
rewriting with self-overlapping rules makes no sense for terms or trees.

3.2.4  Other  Considerations 

Other  issues  are  also  affected  by  choice  of  representation: 

1.  Freeing  unreferenced  items  will  be  simpler  for  the  tree  representation,  since 
something  like  reference  counts  will  be  needed  for  the  dag  representation. 

2.  Simultaneous  attempts  to  read  a  value  needed  for  finding  a  match  or  for  testing 
equality  of  subterms,  may  induce  delays  in  the  case  of  the  dag  representation. 
However,  that  problem  may  be  solved  at  the  hardware  level. 

Figure  3  gives  examples  that  illustrate  the  different  criteria  of  comparison  that  we 
have  discussed  between  tree  and  dag  representations  of  a  term. 

[Figure 3 compares, case by case, rewriting on the tree representation and on the
dag representation of a term. The drawings of the terms and rules are not
reproduced; the accompanying text of each case follows.]

Case 1.
Tree: The normal form b is reached in one step of concurrent rewriting.
Dag: Depending on how the reduction of dags is implemented, the normal form (b) is
reached in one (resp. two) step(s) of concurrent rewriting. The second step would be
needed if simultaneous read access to a node is sequentialised. In this case we obtain
first [figure] and then b.

Case 2.
Tree: The normal form [figure] is reached by one step of concurrent rewriting at
occurrences 1 and 2.1.
Dag: The normal form [figure] is reached by one step of concurrent rewriting at
occurrence 2.1.

Case 3.
Tree: One needs to test equality of the subtrees at occurrences 1 and 2.1. The result b
is obtained in one step of concurrent rewriting at occurrence ε.
Dag: The same result is obtained with the same rewriting but the equality test is
trivial.

Case 4.
Tree: By only modifying the cell at occurrence 2.1, one obtains [figure].
Dag: The previous trick cannot be used and local copying is required. The result is
[figure].

Figure 3: Impact of the term representation on rewriting

3.2.5 Summary

The following table summarises the points we have discussed.

Advantages of the tree representation:

• There is no shared structure and thus no overhead due to multiple read access
to a node.

Advantages of the dag representation:

• Allows (possibly maximal) sharing and thus is more space-efficient.

• Testing equality of subtrees is generally efficient and is trivial in the case of
maximal sharing.

• No copying is needed when subtrees are duplicated.

• There is no need to avoid overlapping in concurrent rewriting.

Drawbacks of the tree representation:

• Testing equality of subtrees is expensive.

• Copying of duplicated subtrees is needed, which can be expensive for large trees.

• Overlapping redexes require special treatment.

Drawbacks of the dag representation:

• Rewriting may require local duplication in the dag.

4  Partitioned  Term  Rewriting 

In  order  to  be  able  to  execute  large  programs  on  the  RRM,  both  the  terms  to  be 
reduced  and  the  set  of  rules  constituting  the  program  have  to  be  partitioned: 

•  Terms  should  be  partitioned  among  different  processors  because: 

-  The  size  of  each  processor  is  limited, 

-  Each  processor  is  working  in  a  SIMD  mode  and  in  general  the  terms  are 
not  homogeneous  (in  the  sense  that  the  function  symbols  appearing  in 
different  parts  of  the  tree  may  be  quite  different)  so  that  the  potential 
for  parallelism  cannot  be  exploited  if  the  tree  is  not  partitioned. 

•  The  set  of  rules  is  partitioned  because: 

-  The  size  of  each  processor’s  rule  memory  is  limited, 

-  For  efficiency  purposes  it  is  appropriate: 

*  To  increase  the  number  of  successful  matches  by  flow  analysis,  which 
allows  localising  the  set  of  rules  that  can  possibly  apply  on  a  given 
term, 

*  To  isolate  as  much  as  possible  the  rules  involving  node  or  processor 
communication  such  as: 

• Non-left-linear rules which require testing for equality of subterms,

•  Overlapping  rules,  which  require  mutual  exclusion  of  matching 
in  a  given  neighborhood. 

All  these  points  are  developed  and  discussed  below. 


4.1 Partitioning of Rules

4.1.1 Stratification by Rule Complexity

The  complexity  that  we  have  in  mind  regards  difficulty  in  matching  a  rule.  The 
concurrent  matching  process  is  simplest  for  sets  of  left-linear  nonoverlapping  rules; 
it  becomes  more  complex  with  the  presence  of  non-left-linear  or  self-overlapping 
rules.  The  worst  case  is  a  rule  that  is  not  left-linear  and,  in  addition,  overlaps  with 
itself. 

A left-linear rule is less complex than a nonlinear one. For instance, rule (0) in the
rational numbers example (Figure 4) is the only non-left-linear rule. To decide whether
rule (0) matches the term

(†) (3 / (2 / (3 * 7))) / (3 / (2 / (5 * 7)))

at the top, it is not enough to see that the two subterms below the top _/_ symbol
are nonzero rationals; one has to check for equality of those two subterms, namely 3
/ (2 / (3 * 7)) and 3 / (2 / (5 * 7)). Left-linear rules are simpler because
matching only requires local inspection of the tree in a region the size of the rule's
lefthand side. For instance, rule (1) matches at the top by instantiating R to 3 /
(2 / (3 * 7)), R' to 3, and S' to 2 / (5 * 7); no further inspection of those
three subterms is required.

Self-overlapping rules are harder to match in parallel than rules that do not overlap
with themselves. This is because two matches of the same rule can overlap with each
other, imposing an additional communication overhead to arbitrate such conflicts.
For instance, rule (1) below overlaps with itself and also matches the left subterm of
the term (†) above by instantiating R to 3, R' to 2, and S' to 3 * 7. Rule (1) also
matches the right subterm of (†) in a completely similar way. Thus, attempting to
match rule (1) in parallel to the term (†) will require communication to resolve the
contention between those three overlapping matches. An even worse case appears
when the rule is both non-left-linear and self-overlapping; then, both communication
costs (for deciding equality of subterms and for resolving contention of overlapping
matches) have to be paid. For instance, the rule

R' * ((1 / R') * R) = R

is an example of a non-left-linear, self-overlapping rule.

Given  a  set  of  rules,  we  may  want  to  stratify  the  set  according  to  rule  complexity 
and  then  partition  each  stratum  into  maximal  sets  of  nonoverlapping  rules  whenever 
possible.  In  this  way,  we  can  obtain  a  partition  of  the  set  of  rules  into  subsets  that  is 
optimal  from  the  efficiency  point  of  view.  More  efficient  rules  could  be  tried  first  and 
less  efficient  rules  could  be  isolated  and  adequately  postponed.  Thus,  sets  of  rules 
can  be  organized  from  more  to  less  efficient  according  to  the  following  categories: 

1.  Sets  of  left-linear,  nonoverlapping  rules 

2.  Sets  of  non-left-linear,  nonoverlapping  rules 

3.  Sets  of  left-linear  self-overlapping  rules 

4.  Sets  of  non-left-linear  self-overlapping  rules. 
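
The four categories can be assigned automatically from the lefthand side of a rule. The Python sketch below is a minimal illustration under several assumptions: terms are nested tuples with variables as strings, sorts and the assoc/comm attributes are ignored, only self-overlap (not overlap between different rules) is tested, and self-overlap is decided by syntactic unification of the lefthand side with a renamed copy of one of its proper non-variable subterms. All function names are hypothetical.

```python
from collections import Counter

def variables(t):
    if isinstance(t, str):
        yield t
    elif isinstance(t, tuple):
        for a in t[1:]:
            yield from variables(a)

def is_left_linear(lhs):
    return all(n == 1 for n in Counter(variables(lhs)).values())

def rename(t, suffix):
    """Rename all variables of t so that two copies share no variable."""
    if isinstance(t, str):
        return t + suffix
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(a, suffix) for a in t[1:])
    return t

def walk(t, s):
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s):
    """Syntactic unification; returns a substitution or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def proper_nonvar_subterms(t):
    if isinstance(t, tuple):
        for a in t[1:]:
            if not isinstance(a, str):
                yield a
                yield from proper_nonvar_subterms(a)

def is_self_overlapping(lhs):
    copy = rename(lhs, "#2")
    return any(unify(sub, copy, {}) is not None
               for sub in proper_nonvar_subterms(lhs))

def stratum(lhs):
    """Category 1-4 in the order of the list above."""
    linear, overlapping = is_left_linear(lhs), is_self_overlapping(lhs)
    if not overlapping:
        return 1 if linear else 2
    return 3 if linear else 4

RULE0 = ("/", "R'", "R'")                          # R' / R'
RULE1 = ("/", "R", ("/", "R'", "S'"))              # R / (R' / S')
RULE2 = ("/", ("/", "R", "R'"), "S'")              # (R / R') / S'
WORST = ("*", "R'", ("*", ("/", 1, "R'"), "R"))    # R' * ((1 / R') * R)
strata = [stratum(r) for r in (RULE0, RULE1, RULE2, WORST)]
```

Sorting a rule set by the value returned by stratum gives the "more efficient rules first" ordering described above; a full treatment would also have to account for overlaps between distinct rules.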


obj RAT is
  protecting INT .
  sorts NzRat Rat .
  subsorts Int < Rat .
  subsorts NzInt < NzRat < Rat .
  op _/_ : Rat NzRat -> Rat .
  op _/_ : NzRat NzRat -> NzRat .
  op -_ : Rat -> Rat .
  op -_ : NzRat -> NzRat .
  op _+_ : Rat Rat -> Rat [assoc comm] .
  op _*_ : Rat Rat -> Rat [assoc comm] .
  op _*_ : NzRat NzRat -> NzRat [assoc comm] .
  vars R S : Rat .
  vars R' S' T' : NzRat .
  eq : R' / R' = 1 .
  eq : R / (R' / S') = (R * S') / R' .
  eq : (R / R') / S' = R / (R' * S') .
  ceq : J' / I' = quot(J',gcd(J',I')) / quot(I',gcd(J',I'))
        if gcd(J',I') =/= 1 .
  eq : R / 1 = R .
  eq : 0 / R' = 0 .
  eq : R / (- R') = (- R) / R' .
  eq : - (R / R') = (- R) / R' .
  eq : R + (S / R') = ((R * R') + S) / R' .
  eq : R * (S / R') = (R * S) / R' .

Figure 4: The rational numbers example

In the rational numbers example (Figure 4), the only non-left-linear rule is rule (0).
The overlap between rules is summarised in the table below:


Rule    Overlaps with
0       1-9
1       0-9
2       0-9
3       0-2, 4, 6-9
4       0-3, 5, 7-9
5       1-2, 4, 6-9
6       0-3, 5, 7-9
7-9     0-6

We  can  then  form  the  following  stratified  partition  of  the  set  of  rules  0-9: 

1. Nonoverlapping and left-linear: {7,8,9}, {3,5}, {6,4}

2. Non-left-linear and nonoverlapping: {0}

3. Self-overlapping and left-linear: {1,2}

4.1.3 Rule Restriction by Flow Analysis

This is a method to reduce the number of rules to be tried for concurrent matching.
It can be very useful as a way of maximizing the number of successful matches
when reducing a given tree. It can also help to use the rule storage resources
associated with each processor efficiently, minimizing rule communication cost.
Additionally, it may even provide a useful heuristic for tree partitioning, since by
clustering the flow graph, occurrences of critical function symbols that mark
transitions between function clusters could be identified. The method should be seen
as complementary to rule stratification by complexity; combining the two methods,
one obtains a stratification of rules by complexity such that only those
rules that could potentially apply to a given problem are represented.

The  key  idea  is  to  group  together  rules  having  the  same  function  symbol  at  the 
top  of  their  lefthand  side,  and  to  relate  such  sets  of  rules  by  flow  analysis  of  their 
corresponding  function  symbols.  Here  are  two  simple  notions  for  flow  analysis  in 
this  context: 

Definition 1 A function symbol f is said to weakly flow into a function symbol g
if there is a rule with f at the top of its lefthand side such that g occurs somewhere
in its righthand side; if, in addition, g does not occur in the lefthand side of the rule,
then f is said to strongly flow into g, i.e., strong flow is a subrelation of weak flow.
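
The two flow relations of Definition 1 can be computed directly from a rule set. The following is a minimal sketch, not the RRM compiler's method: it assumes terms are encoded as nested Python tuples `(symbol, arg1, ..., argn)` with variables as plain strings, and the names `symbols` and `flow_relations` are hypothetical helpers introduced only for illustration.

```python
def symbols(term):
    """Yield every function symbol occurring in a term.

    Assumed encoding: applications are tuples (symbol, *args),
    variables are plain strings and contribute no symbol.
    """
    if isinstance(term, tuple):
        yield term[0]
        for arg in term[1:]:
            yield from symbols(arg)


def flow_relations(rules):
    """Return (weak, strong) flow as sets of (f, g) pairs, per Definition 1."""
    weak, strong = set(), set()
    for lhs, rhs in rules:
        f = lhs[0]                      # symbol at the top of the lefthand side
        lhs_symbols = set(symbols(lhs))
        for g in set(symbols(rhs)):
            weak.add((f, g))            # g occurs in the righthand side
            if g not in lhs_symbols:
                strong.add((f, g))      # ... and not in the lefthand side
    return weak, strong


# Two rules from the rational numbers example, in the assumed encoding.
rules = [
    # R / (R' / S') = (R * S') / R'
    (("/", "R", ("/", "R'", "S'")), ("/", ("*", "R", "S'"), "R'")),
    # R * (S / R') = (R * S) / R'
    (("*", "R", ("/", "S", "R'")), ("/", ("*", "R", "S"), "R'")),
]
weak, strong = flow_relations(rules)
```

For these two rules, `/` flows strongly into `*` (the first rule introduces `*` on its righthand side only), whereas `*` flows only weakly into `/`, since `/` already occurs in the lefthand side of the second rule.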

The diagram in Figure 5 gives the flow graph for the equations of the rational
numbers example. Notice that gcd and quot are integer functions; i.e., the flow
graph cuts across different OBJ modules.

Figure 5: Flow graph for the equations of the rational numbers example

We can use such a flow graph at compile time to determine what rules will be needed
to reduce a given tree. For instance, by flow analysis of the rules for the rational
numbers, a compiler could determine that only the rules {0,1,2,3,4,9} and the rules
for gcd and quot will be needed to reduce the term

(3 / 6) / (7 * (16 / 12)).

Thus,  to  reduce  such  a  term  we  can  restrict  the  original  partition  by  rule  complexity 
for  the  rational  numbers  example  to  obtain  a  smaller  partition: 

1. Nonoverlapping and left-linear: {3}, {4}, {9}

2. Non-left-linear and nonoverlapping: {0}

3. Self-overlapping and left-linear: {1,2}

Restricting the set of rules by means of flow analysis, as in the above example, has
two obvious advantages:

• The rate of successful matches will increase, since rules that will always fail
are excluded.

• The storage of rules in a processor is facilitated, since fewer rules have to be
stored.
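
The restriction step can be sketched as a reachability computation over the weak flow relation. This is only a coarse approximation under stated assumptions: terms are nested tuples with string variables, rules are kept whenever the top symbol of their lefthand side is reachable from the symbols of the term, and `restrict_rules` is a hypothetical name. A finer analysis, such as the one that yields {0,1,2,3,4,9} above, can discard further rules that share a reachable top symbol.

```python
def symbols(term):
    """Yield the function symbols of a term (tuples; variables are strings)."""
    if isinstance(term, tuple):
        yield term[0]
        for arg in term[1:]:
            yield from symbols(arg)


def restrict_rules(rules, term):
    """Return the names of rules whose top lhs symbol is reachable from `term`.

    `rules` maps a rule name to a pair (lhs, rhs). Reachability is the
    closure of the term's own symbols under weak flow.
    """
    reachable = set(symbols(term))
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules.values():
            if lhs[0] in reachable:
                new = set(symbols(rhs)) - reachable
                if new:
                    reachable |= new
                    changed = True
    return {name for name, (lhs, _) in rules.items() if lhs[0] in reachable}


def num(n):
    """Encode an integer constant as a nullary symbol."""
    return (str(n),)


# Four rules of the rational numbers example, keyed by their rule numbers.
rules = {
    1: (("/", "R", ("/", "R'", "S'")), ("/", ("*", "R", "S'"), "R'")),
    7: (("-", ("/", "R", "R'")), ("/", ("-", "R"), "R'")),
    8: (("+", "R", ("/", "S", "R'")), ("/", ("+", ("*", "R", "R'"), "S"), "R'")),
    9: (("*", "R", ("/", "S", "R'")), ("/", ("*", "R", "S"), "R'")),
}

# (3 / 6) / (7 * (16 / 12))
term = ("/", ("/", num(3), num(6)), ("*", num(7), ("/", num(16), num(12))))
needed = restrict_rules(rules, term)
```

On this term the rules for unary minus and for addition are dropped, because neither `-` nor `+` is reachable from `/` and `*`.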

4.2 Tree Partition

Each of the processors of the RRM has a limited capacity, so that terms exceeding
a certain size cannot be stored in a single processor. This means that, when trees
get too big, they have to be partitioned so that some upper fragment of the tree
remains in the original processor whereas subtrees below that fragment are shipped
to other available processors. Besides being a need imposed by a processor's storage
capacity, partitioning of a tree may in fact be advantageous to increase the amount
of concurrent rewriting. This is due to the possibly nonhomogeneous structure of a
large tree, so that portions of the tree that are distant from each other may involve
very different function symbols. Assuming that each processor will do concurrent
rewriting in a SIMD mode, lack of homogeneity would limit parallelism, since match
attempts for a rule can succeed only in some fragment of the tree. If trees are
partitioned into relatively homogeneous parts, the amount of concurrent rewriting
can increase, since now all fragments of the tree can be active doing concurrent
rewriting with the rules appropriate for each fragment.

We  shall  address  two  issues  that  arise  in  tree  partitioning.  The  first  regards  criteria 
that  should  be  used  to  partition  a  tree,  i.e.,  when  and  where  should  a  tree  be 
partitioned;  the  second  has  to  do  with  communication  and  reconfiguration  problems 
when  term  rewriting  takes  place  across  a  partition  boundary,  thus  involving  several 
processors. 

4.2.1  Criteria  for  Tree  Partitioning 

Trees should not be partitioned at random; rather, the criteria used should be to
try to maximize parallelism and to minimize communication between the fragments
of the partition (i.e., interprocessor communication, since each fragment will be
stored inside a different processor). Regarding tree size, there should be a certain
size threshold, related to the maximal storage capacity of a processor, so that tree
partitioning begins after that threshold has been reached, if other considerations
do not force it before then. In addition to size, the following factors should also be
taken into account in a tree partitioning strategy:

• Expected rate of tree growth associated with a subterm. This can be guessed by
inspection of both the subterm and the equations associated with its function
symbols (more generally, equations of other function symbols closely related in
the flow analysis graph to those in the term) (cf. Section 4.1.2). Information
on this matter could also be the subject of annotations given by the programmer,
as for strategies (cf. Section 5). When a subterm with a high growth
rate appears in the tree, such a subterm could be sent to another processor,
especially if the original tree already exceeds a certain size.

• Flow analysis information. The function symbol flow graph introduced in
Section 4.1.2 may be used to detect transitions to a different "homogeneous
component" of the tree, for which rules different from the ones used so far will
apply. Function symbols that show strong flow relations with each other could
be grouped together into clusters, with each cluster corresponding to a different
"homogeneous component" of the tree. If a subterm marks the transition
to a different homogeneous component, such a subterm, together with the
rules for the new component, could be sent to a new processor. However, such
transitions may be hard to detect if the clustering of function symbols does
not provide sufficient separation between homogeneous components, and more
experience with the flow analysis technique described in Section 4.1.2 will be
needed to assess its potential for tree partitioning.

As  with  other  issues  in  this  report,  design  of  a  tree  partitioning  strategy  that  takes 
advantage  of  the  above  factors  should  be  based  on  experimental  results  with  an 
ample  collection  of  examples.  The  approach  to  simulation  described  in  [13]  should  be 
extended  to  partitioned  tree  rewriting  in  order  to  provide  the  necessary  experimental 
basis. 
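
As a point of reference for such experiments, the size criterion alone can be sketched as follows. This is an illustrative baseline, not a proposed RRM strategy: it ignores growth rate and flow information, assumes every processor has the same node capacity (larger than the maximal arity plus one), and represents a link to another fragment by an integer leaf. The names `size`, `partition` and `reassemble` are hypothetical.

```python
def size(term):
    """Number of nodes; an int leaf is a link to another fragment (one node)."""
    if isinstance(term, int):
        return 1
    return 1 + sum(size(arg) for arg in term[1:])


def partition(term, capacity):
    """Split `term` into fragments of at most `capacity` nodes each.

    Returns a dict mapping fragment ids to fragments; fragment 0 is the
    upper fragment that stays in the original processor.
    """
    fragments = {}

    def cut(t):
        if isinstance(t, int) or len(t) == 1:
            return t
        children = [cut(arg) for arg in t[1:]]
        while 1 + sum(size(c) for c in children) > capacity:
            # Ship the largest child that is not already a link.
            candidates = [i for i, c in enumerate(children)
                          if not isinstance(c, int)]
            if not candidates:
                raise ValueError("capacity too small for this arity")
            i = max(candidates, key=lambda k: size(children[k]))
            fid = len(fragments) + 1
            fragments[fid] = children[i]
            children[i] = fid            # replace the subtree by a link
        return (t[0], *children)

    fragments[0] = cut(term)
    return fragments


def reassemble(fragments, t=None):
    """Follow the links to rebuild the original tree (for checking only)."""
    t = fragments[0] if t is None else t
    if isinstance(t, int):
        return reassemble(fragments, fragments[t])
    return (t[0], *(reassemble(fragments, arg) for arg in t[1:]))


tree = ("f", ("g", ("a",), ("b",)), ("h", ("c",), ("d",)))
frags = partition(tree, capacity=4)
```

With a capacity of 4 nodes, the root keeps only `f` and two links, and the subtrees headed by `g` and `h` become separate fragments.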


4.2.2 Term Rewriting Across Fragments of a Partition

Trees should be partitioned so as to minimize interprocessor communication. However,
interprocessor communication may be unavoidable, due to matching attempts
that need to inspect a portion of the tree in a boundary between two or more
fragments stored in different processors. Blocking such an attempt could result in
failure to attain the final result of a computation. Two related questions arise in
this context:

1.  How  should  matching  attempts  across  a  boundary  between  tree  fragments  be 
handled? 

2. How should the tree be reconfigured after a successful match across a boundary?

Regarding the first question, two alternatives that can be considered are: (i) to ship
a fragment of the tree up to the parent processor requesting the match, and (ii)
to ask the child processor to do part of the match. The second alternative seems
preferable, since match attempts fail most of the time and may require unbounded
inspection of the child subtree for non-left-linear rules. If the first alternative is
chosen, unnecessary communication cost will be incurred in the many cases where no
match exists. Also, shipping a subtree up will generally increase the number
of links between processors, since a link to the top of the child subterm would, in
the first alternative, have to be replaced by links to all the children trees of the tree
being shipped up, thus increasing the amount of interprocessor communication.

The reconfiguration question can be considered assuming a tree representation or a
dag representation. In the following we will discuss an example for the dag case;
the tree case is similar, but the reader should be aware of the limitations of the tree
representation discussed in Section 3.

The example is illustrated in Figure 6. The rule $f(g(x), y) \to k(r(x,x), h(y))$ will
match across trees 2-4. Since tree 3 originally had two links from two parent trees,
the function symbol g has to be recopied after rewriting so that the link with tree
1 remains consistent. How should the righthand side be partitioned between trees
2 and 4? One option is to insert it exclusively within tree 2; this, however, would
unnecessarily increase by one the number of links between trees. A better solution
that would, on the average, tend to balance the amount of tree growth after a
rewriting on a boundary would be to partition the righthand side bringing down as
much of it as possible. This leads to the solution in Figure 6, and (in this case)
avoids increasing the number of links. In general, the amount of space available in
each processor may dictate a different strategy for reconfiguring the tree by insertion
of the righthand side.


Figure 6: Example of reconfiguration of a tree after rewriting


5 Strategies


For most computations, the order of evaluation does not affect the final result. This
informal fact has a formal counterpart in the Church-Rosser property of a set of
rewrite rules, which guarantees that different evaluations of the same tree can always be
reconciled by further evaluation. The Church-Rosser property holds for concurrent
rewriting if and only if it holds for sequential rewriting, and indeed this property,
since it allows rewritings to be done in any order, ensures the fully concurrent nature
of tree rewriting. However, there are reasons that may make it advisable to impose
certain control mechanisms, called evaluation strategies, on the order of evaluation,
although evaluation itself remains concurrent. These reasons include:

• Space efficiency, since certain rewritings may perform unnecessary computations
that greatly increase the size of the tree.

•  Termination  of  computations  that  in  general  may  not  necessarily  terminate 
but  where  the  order  of  evaluation  matters  for  finding  a  final  result  when  there 
is  one. 

•  Concurrency  control  purposes,  when  term  rewriting  is  used  as  a  method  to 
implement  communication  among  processes. 

The control mechanism that we have been exploring is that of a tree rewriting
strategy. The notion of E-strategy (E is for evaluation) was introduced in sequential
OBJ2 [4] as a powerful and flexible way to control the order of evaluation, so as
to improve efficiency of execution. For example, given a three-argument operator
$f$ with strategy [1 2 0 3] and an expression $f(t_1, t_2, t_3)$ to be reduced, the first
argument (indicated by the number 1) must be reduced first, say to $t_1'$, before
reducing the second argument $t_2$ to $t_2'$ (indicated by the number 2); then we must
rewrite at the top (indicated by the number 0) of $f(t_1', t_2', t_3)$ before finally going on
to reduce $t_3$. This kind of sequential evaluation is not appropriate for concurrent
rewriting; fortunately, however, the interpretation of E-strategies can be generalized
from the sequential case to the concurrent case. For the example given above, we
would begin by evaluating all three major subtrees of $f(t_1, t_2, t_3)$ concurrently, until
its first and second subtrees are reduced; then we would apply rules to the top of
the resulting tree before going on to reduce the third argument further (if needed).
More generally, concurrent E-strategies can be provided in the following three ways,
listed in increasing order of the amount of control imposed on the computation:

Concurrency with priority. This is the strategy described previously: concurrency
is not affected until the arguments having priority are normalized, and then
one action (in the previous example, the reduction on top) is given the highest
priority.

Concurrency with rendezvous. Reduction is executed concurrently, but some
subterms must all be in normal form before reduction at the top is performed.
For example, if the operator $f$ has the strategy (123|0), then no reduction on
top of the tree can be done before $t_1$, $t_2$ and $t_3$ are all normalized.

Exclusivity. This is the strategy that ensures the exclusivity of reduction at one
occurrence. For instance, if the operator $f$ has the strategy [!1 !2 !3 !0], the
first argument $t_1$ must be reduced before reductions in the other
subtrees $t_2$, $t_3$ and on top are permitted. Then $t_2$ itself is exclusively reduced,
next $t_3$, and at last the resulting tree $f(t_1', t_2', t_3')$ itself is reduced.

We have found that such concurrent rewriting strategies can yield significant savings
of space without reducing the amount of useful concurrency.

More generally, it appears that OBJ with E-strategies can be used to specify quite
general concurrent processes, for example, protocols. If so, this should be an important
advance in specification technology.
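
The sequential reading of an E-strategy such as [1 2 0 3] can be made concrete with a small interpreter. The sketch below is an illustration only, not OBJ2's implementation: terms are tuples `(op, *args)` with constants as strings, each operator's rules are given as a Python function that returns the rewritten term or `None`, and operators without an explicit strategy are assumed to use the eager default [1 ... n 0]. All names are hypothetical.

```python
def reduce(term, rules, strategies):
    """Reduce `term` following sequential E-strategies.

    A strategy is a list of argument positions (1-based) to reduce,
    with 0 meaning "try to rewrite at the top".
    """
    if isinstance(term, str):                 # constants are normal forms
        return term
    op, args = term[0], list(term[1:])
    default = list(range(1, len(args) + 1)) + [0]
    for step in strategies.get(op, default):
        if step == 0:
            result = rules.get(op, lambda *a: None)(*args)
            if result is not None:            # a rule applied at the top
                return reduce(result, rules, strategies)
        else:
            args[step - 1] = reduce(args[step - 1], rules, strategies)
    return (op, *args)


rules = {
    "not": lambda x: {"true": "false", "false": "true"}.get(x),
    "if": lambda c, a, b: a if c == "true" else b if c == "false" else None,
    "loop": lambda: ("loop",),                # loop -> loop never terminates
}

# Reduce only the condition of `if` before rewriting at the top.
strategies = {"if": [1, 0]}

term = ("if", ("not", "false"), "a", ("loop",))
result = reduce(term, rules, strategies)
```

Here the strategy [1 0] for `if` yields the result `a`, whereas the eager default would first try to normalize the nonterminating third argument. This illustrates the termination motive listed above; the concurrent interpretations relax the sequential order between the steps.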

A  Appendix:  Concurrent  rewriting 

A.1 Definitions


Our definitions and notations are consistent with those of G. Huet and D. Oppen [9].
Given a set $X$ of variables and a graded set $F$ of function symbols, the free $F$-algebra
over $X$ is denoted $T(F,X)$ and its elements are called terms. Similarly, many-sorted
terms can be defined as in [12]. Terms can be viewed as functions from the free
monoid on the natural numbers, denoted $N^*$, to $F \cup X$. The domain of the term $t$
considered as a function is denoted $O(t)$ and is called the set of occurrences of $t$.
For example, $t(\epsilon)$ is the top symbol of the term $t$. $Var(t)$ denotes the set of variables
of $t$, $t|_m$ the subterm of $t$ at occurrence $m$ (for $m \in O(t)$), and $t[m \leftarrow t']$ the term
obtained by replacing $t|_m$ by $t'$ in $t$. A term is linear iff for any $x \in Var(t)$, $x$ has
only one occurrence in $t$. The set of nonvariable occurrences in a term $t$ is denoted
$\overline{O}(t)$.
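
These notions translate directly into executable form. The sketch below assumes one particular encoding that is not part of the definitions: a term is a nested tuple `(symbol, arg1, ..., argn)`, a variable is a plain string, and an occurrence is a tuple of 1-based child indices with `()` standing for $\epsilon$.

```python
def occurrences(t):
    """O(t): every occurrence of t, as tuples of 1-based child indices."""
    occs = [()]
    if isinstance(t, tuple):
        for i, arg in enumerate(t[1:], start=1):
            occs += [(i,) + m for m in occurrences(arg)]
    return occs


def subterm(t, m):
    """t|m: the subterm of t at occurrence m."""
    for i in m:
        t = t[i]          # index 0 holds the symbol, so children are 1-based
    return t


def replace(t, m, new):
    """t[m <- new]: replace the subterm of t at occurrence m by `new`."""
    if not m:
        return new
    i = m[0]
    return t[:i] + (replace(t[i], m[1:], new),) + t[i + 1:]


def variables(t):
    """List the variable occurrences' names, with repetitions."""
    if isinstance(t, str):
        return [t]
    return [v for arg in t[1:] for v in variables(arg)]


def is_linear(t):
    """A term is linear iff no variable occurs more than once."""
    vs = variables(t)
    return len(vs) == len(set(vs))


t = ("*", "x", ("e",))                 # the term x * e
top_symbol = subterm(t, ())[0]         # t(epsilon)
```

For instance, `x * e` has the three occurrences `()`, `(1,)` and `(2,)` and is linear, while the lefthand side `R' / R'` of rule (0) in the rational numbers example is not.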

Substitutions $\sigma$ are endomorphisms of $T(F,X)$ with a finite domain $D(\sigma)$. A
substitution $\sigma$ is denoted by $(x_1 \leftarrow t_1), \ldots, (x_n \leftarrow t_n)$. We denote by $\sigma|_W$ the restriction
of the substitution $\sigma$ to the subset $W$ of $X$. If $\Sigma$ is a set of substitutions then
$\Sigma|_W = \{\sigma|_W \mid \sigma \in \Sigma\}$ is the set of elements of $\Sigma$ restricted to $W$.

An axiom is a pair $\{t, t'\}$ of terms denoted $t = t'$. A rewrite rule is a pair $\{t, t'\}$
of terms, denoted $t \to t'$, such that $Var(t')$ is a subset of $Var(t)$. A set of rewrite
rules is called a Term Rewriting System (TRS). A rewrite rule $t \to t'$ is called
left-linear (resp. right-linear) iff $t$ (resp. $t'$) is a linear term, i.e. no variable occurs
more than once in $t$ (resp. $t'$). For example, if $x$ is a variable, $e$ is a constant, and $*$
is a binary operator, then the rule $x * e \to x$ is left-linear.

If $R$ is a TRS, the rewriting relation $\to_R$ is defined as follows: $t \to_R t'$ iff there
exists an occurrence $m$ in $O(t)$, a rule $l \to r$ in $R$ and a substitution $\sigma$ such that
$t|_m = \sigma(l)$ and $t' = t[m \leftarrow \sigma(r)]$. $t|_m$ is called a redex in $t$ at occurrence $m$ under the
rewrite rule $l \to r$. For $t$ fixed, a redex may be represented by the 3-tuple $(m, l, r)$.
For example, for the previous rule $x * e \to x$, $(1 * 2) * e$ is a redex in the term
$2 * (2 * ((1 * 2) * e))$ at occurrence 2.2. We write $t \to_{[m, \sigma, l \to r]} t'$ if we want to make
explicit the occurrence, the substitution, and the rule involved in the rewriting.
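
The redex in this example can be located mechanically by matching the lefthand side against every subterm. The following is a minimal sketch under an assumed encoding (terms as nested tuples, variables as strings, occurrences as tuples of 1-based indices); the function names are hypothetical.

```python
def occurrences(t):
    """O(t) as tuples of 1-based child indices; () is the top occurrence."""
    occs = [()]
    if isinstance(t, tuple):
        for i, arg in enumerate(t[1:], start=1):
            occs += [(i,) + m for m in occurrences(arg)]
    return occs


def subterm(t, m):
    for i in m:
        t = t[i]
    return t


def replace(t, m, new):
    if not m:
        return new
    i = m[0]
    return t[:i] + (replace(t[i], m[1:], new),) + t[i + 1:]


def match(pattern, term, subst=None):
    """Return sigma with sigma(pattern) == term, or None if there is none."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                       # a variable
        if pattern in subst and subst[pattern] != term:
            return None                                # non-left-linear clash
        subst[pattern] = term
        return subst
    if (isinstance(term, str) or pattern[0] != term[0]
            or len(pattern) != len(term)):
        return None
    for p, s in zip(pattern[1:], term[1:]):
        subst = match(p, s, subst)
        if subst is None:
            return None
    return subst


def substitute(t, subst):
    """Apply a substitution to a term."""
    if isinstance(t, str):
        return subst[t]
    return (t[0], *(substitute(arg, subst) for arg in t[1:]))


def redexes(t, rules):
    """R(t): all triples (m, l, r) such that t|m is an instance of l."""
    return [(m, l, r) for m in occurrences(t) for (l, r) in rules
            if match(l, subterm(t, m)) is not None]


def rewrite(t, redex):
    """One rewriting step at the given redex."""
    m, l, r = redex
    return replace(t, m, substitute(r, match(l, subterm(t, m))))


def times(a, b):
    return ("*", a, b)


e, one, two = ("e",), ("1",), ("2",)
rule = (times("x", e), "x")                            # x * e -> x
t = times(two, times(two, times(times(one, two), e)))
found = redexes(t, [rule])
```

The only redex found is at occurrence `(2, 2)`, and rewriting there gives `2 * (2 * (1 * 2))`.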

A nonvariable term $t$ overlaps a term $t'$ at occurrence $m \in O(t')$ iff there exists
a substitution $\theta$ such that $\theta(t) = \theta(t'|_m)$. For example, the term $x * e$ overlaps at
occurrence 1 the term $(2 * y) + 3$. A set $R$ of rules is called nonoverlapping iff
for any pair of rules $l \to r$, $l' \to r'$ in $R$, $l$ and $l'$ do not overlap each other at
any (nonvariable) occurrence. Otherwise the set is said to be overlapping. A rule
$l \to r$ is called self-overlapping iff $l$ overlaps itself at an occurrence different from
$\epsilon$; associativity is a typical example of a self-overlapping rule.
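
Overlap can be tested by syntactic unification. The sketch below makes two assumptions that go beyond the text: the variables of the two terms are renamed apart before unifying (the usual convention, so that a rule can be compared with itself), and equational attributes such as associativity and commutativity of operators are ignored. Terms are nested tuples with string variables; all function names are hypothetical.

```python
def walk(t, s):
    """Follow variable bindings in substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, arg, s) for arg in t[1:])


def unify(a, b, s=None):
    """Return a unifier of a and b extending s, or None."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def rename(t, suffix):
    """Rename every variable of t by appending a suffix."""
    if isinstance(t, str):
        return t + suffix
    return (t[0], *(rename(arg, suffix) for arg in t[1:]))


def occurrences(t):
    occs = [()]
    if isinstance(t, tuple):
        for i, arg in enumerate(t[1:], start=1):
            occs += [(i,) + m for m in occurrences(arg)]
    return occs


def subterm(t, m):
    for i in m:
        t = t[i]
    return t


def overlaps(t, t2, m):
    """Does t overlap t2 at the (nonvariable) occurrence m of t2?"""
    sub = subterm(t2, m)
    if isinstance(sub, str):
        return False
    return unify(rename(t, "_1"), rename(sub, "_2")) is not None


def self_overlapping(lhs):
    return any(overlaps(lhs, lhs, m) for m in occurrences(lhs) if m != ())


x_times_e = ("*", "x", ("e",))
sum_term = ("+", ("*", ("2",), "y"), ("3",))        # (2 * y) + 3
assoc_lhs = ("*", ("*", "x", "y"), "z")             # (x * y) * z
```

With these definitions, `overlaps(x_times_e, sum_term, (1,))` holds, as in the example above, and the lefthand side of an associativity rule is reported as self-overlapping.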

Given a set $A$ of equations, the $A$-equality relation (denoted $=_A$) is the smallest
congruence relation closed under instantiation and generated by the set of axioms
$A$. $\vdash\!\dashv_A$ denotes one step of axiom application.

We denote by $\le_A$ the subsumption preorder on $T(F,X)$ defined by: $t \le_A t'$ iff $t' =_A \sigma(t)$
for a substitution $\sigma$ called a match from $t$ to $t'$. Composition of substitutions
$\sigma$ and $\rho$ is denoted by $\sigma.\rho$. Given a subset $V$ of $X$, we define $\sigma \le_A \sigma'\ [V]$ iff
$\sigma' =_A \sigma''.\sigma\ [V]$ for a substitution $\sigma''$.

Let $t$ be a term and $R$ a term rewriting system. Let $R(t) = \{(u_i, l_i, r_i)\}$ be the set of
all the redexes in $t$ under $R$; i.e.

$(u_i, l_i, r_i) \in R(t) \iff l_i \to r_i \in R$ and $\exists \sigma$ s.t. $t|_{u_i} = \sigma(l_i)$

Definition 2 A subset $W$ of $R(t)$ is said to be nonoverlapping (or non-conflicting,
or consistent) iff for any redexes $(u, l, r)$ and $(u', l', r')$ in $W$,

• $u \mid u'$ (i.e. $u$ and $u'$ are incomparable) or

• $u < u'$ and $\exists v \in O_{var}(l)$ s.t. $u.v \le u'$, where $O_{var}(l)$ is the set of variable
occurrences in $l$:

$O_{var}(l) = \{\lambda \mid \lambda \in O(l) \text{ and } l(\lambda) \in X\}$

This definition is illustrated by Figure 1 in the main text.
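
Definition 2 can be checked pairwise on a set of redexes. The sketch below reads the definition symmetrically (either redex of a pair may be the upper one) and treats two redexes at the same occurrence as conflicting; both readings are assumptions, as is the encoding of occurrences as tuples of 1-based indices and of variables as strings.

```python
from itertools import combinations


def is_prefix(u, v):
    """u <= v in the prefix ordering on occurrences."""
    return v[:len(u)] == u


def var_occurrences(l, here=()):
    """O_var(l): the occurrences of l that carry a variable."""
    if isinstance(l, str):
        return [here]
    return [m for i, arg in enumerate(l[1:], start=1)
            for m in var_occurrences(arg, here + (i,))]


def compatible(w1, w2):
    """Do two redexes (u, l, r) satisfy the condition of Definition 2?"""
    (u1, l1, _), (u2, l2, _) = w1, w2
    if u1 == u2:
        return False                          # same occurrence: conflict
    if is_prefix(u2, u1):
        u1, l1, u2, l2 = u2, l2, u1, l1       # make u1 the upper redex
    if not is_prefix(u1, u2):
        return True                           # incomparable occurrences
    # u1 < u2: the lower redex must sit below a variable of l1.
    return any(is_prefix(u1 + v, u2) for v in var_occurrences(l1))


def nonoverlapping(W):
    return all(compatible(a, b) for a, b in combinations(W, 2))


# Redexes of F(C, G(C)) under G(x) -> F(x, G(x)) and C -> G(C).
g_rule = (("G", "x"), ("F", "x", ("G", "x")))
c_rule = (("C",), ("G", ("C",)))
W = [((1,), *c_rule), ((2,), *g_rule), ((2, 1), *c_rule)]

# Redexes of f(g(a)) under f(g(x)) -> x and g(a) -> b: these conflict,
# because g is part of the nonvariable pattern of the upper rule.
conflict = [((), ("f", ("g", "x")), "x"), ((1,), ("g", ("a",)), ("b",))]
```

The first set is nonoverlapping, since the redex at occurrence 2.1 lies below the variable `x` of `G(x)`; the second is not.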

Let  A(t)  be  the  set  of  all  nonoverlapping  subsets  of  redexes  in  t. 

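Definition 3 The relation $\to_{R^{\parallel}}$ of concurrent rewriting is then defined by

$$t \to_{R^{\parallel}} t' \iff \begin{cases} W = \{(u_i, l_i, r_i) \mid 1 \le i \le n\} \in A(t) \\ \text{and} \\ i < j \Rightarrow u_i \not< u_j \\ \text{and} \\ t = t_0 \to_{[u_1, l_1 \to r_1]} t_1 \to \cdots \to_{[u_n, l_n \to r_n]} t_n = t' \end{cases}$$

Note that the last condition specifies the result of applying the set of redexes in $W$
with a bottom-up sequential strategy.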
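
The bottom-up reading of a concurrent step can be sketched as follows. This is an illustration under assumptions, not the RRM execution model: terms are nested tuples with string variables, the set `W` is supplied by the caller and assumed nonoverlapping, and "bottom-up" is realized by sorting the redexes so that longer occurrences are rewritten first, which guarantees that no redex is rewritten before one lying below it.

```python
def subterm(t, m):
    for i in m:
        t = t[i]
    return t


def replace(t, m, new):
    if not m:
        return new
    i = m[0]
    return t[:i] + (replace(t[i], m[1:], new),) + t[i + 1:]


def match(pattern, term, subst=None):
    """Return sigma with sigma(pattern) == term, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if (isinstance(term, str) or pattern[0] != term[0]
            or len(pattern) != len(term)):
        return None
    for p, s in zip(pattern[1:], term[1:]):
        subst = match(p, s, subst)
        if subst is None:
            return None
    return subst


def substitute(t, subst):
    if isinstance(t, str):
        return subst[t]
    return (t[0], *(substitute(arg, subst) for arg in t[1:]))


def concurrent_step(t, W):
    """Apply every redex (u, l, r) of W, deepest occurrences first."""
    for u, l, r in sorted(W, key=lambda w: -len(w[0])):
        sigma = match(l, subterm(t, u))       # re-match: lower rewrites may
        if sigma is None:                     # have changed the substitution
            raise ValueError(f"no redex left at occurrence {u}")
        t = replace(t, u, substitute(r, sigma))
    return t


def times(a, b):
    return ("*", a, b)


a, b, e = ("a",), ("b",), ("e",)
lhs, rhs = times("x", e), "x"                 # x * e -> x
t = times(times(times(a, e), e), times(b, e)) # ((a * e) * e) * (b * e)
W = [((1,), lhs, rhs), ((1, 1), lhs, rhs), ((2,), lhs, rhs)]
result = concurrent_step(t, W)
```

All three redexes of `((a * e) * e) * (b * e)` are consumed in a single concurrent step, giving `a * b`.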

In the definition above, it is possible to define the result $t'$ using the notion of residual,
due to Church, as defined in [8] or [2].

Definition 4 Let $t$ be a term, $R$ a TRS, and $w_i = (u_i, l_i, r_i)$, $i = 1, 2$, two redexes of $R$
in $t$. Then the residual $w_1 \backslash w_2$ of $w_1$ by $w_2$ is the set of redexes defined by:

1. If $w_2$ does not cover $w_1$ (i.e. $u_1 \mid u_2$, or $u_1 < u_2$ and the redexes are
nonoverlapping) then $w_1 \backslash w_2 = \{w_1\}$;

2. If $\exists v \in O_{var}(l_2)$ such that $u_1 = u_2.v.v'$ then

$w_1 \backslash w_2 = \{(u, l_1, r_1) \mid u = u_2.u'.v' \text{ with } u' \in O_{var}(r_2) \text{ and } r_2(u') = l_2(v)\}$;

3. Otherwise $w_1 \backslash w_2 = \emptyset$.

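This notion is now extended to $W$, a set of redexes of $R$ in $t$. Let $w = (u, l, r)$ be any
redex of $R$ in $t$. The residual of $W$ by reducing the redex $w$ is

$$W \backslash w = \bigcup_{w' \in W} w' \backslash w$$

This allows defining the notion of concurrent reduction by

$$t \xrightarrow{W} t' \iff \begin{cases} t \to_{[u, l \to r]} t_1 \text{ for } w = (u, l, r) \in W \\ \text{and} \\ t_1 \xrightarrow{W \backslash w} t' \end{cases}$$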

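Definition 5 The relation of maximal concurrent rewriting $\to_{R^{\parallel}}^{\max}$ is:

$$t \to_{R^{\parallel}}^{\max} t' \iff \begin{cases} t \xrightarrow{W} t' \\ \text{and} \\ W \text{ maximal in } (A(t), \subseteq) \end{cases}$$

The relation of maximally concurrent rewriting $\to_{R^{\parallel}}^{|\max|}$ is:

$$t \to_{R^{\parallel}}^{|\max|} t' \iff \begin{cases} t \xrightarrow{W} t' \\ \text{and} \\ |W| \text{ maximal in } A(t) \text{ for } \le \end{cases}$$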


A.2  Properties  of  Concurrent  Rewriting 

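Definition 6 Let $R$ be a relation on the set of terms, and $\stackrel{*}{\to}$ and $\stackrel{*}{\leftrightarrow}$ be respectively
its reflexive transitive and symmetric reflexive transitive closures.

$R$ is terminating or noetherian iff there is no infinite sequence of the form

$t_0\ R\ t_1 \ldots t_n\ R\ t_{n+1} \ldots$

$R$ is Church-Rosser iff

$\forall t_1, t_2;\ t_1 \stackrel{*}{\leftrightarrow} t_2 \Rightarrow \exists t'$ such that $t_1 \stackrel{*}{\to} t'$ and $t_2 \stackrel{*}{\to} t'$

Proposition 1

• If $\to_R$ is terminating, so is concurrent rewriting, and in particular maximal
concurrent rewriting $\to_{R^{\parallel}}^{\max}$.

• If $\to_R$ is Church-Rosser, then so is concurrent rewriting, and thus maximal
concurrent rewriting $\to_{R^{\parallel}}^{\max}$.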


It is not easy to relax the above termination hypothesis, as shown by the following.


Definition 7 A TRS is weakly terminating iff every term has a normal form.

Example of a weakly terminating system (Barendregt, Huet): Let $R$ be given by

$F(x, x) \to A$
$G(x) \to F(x, G(x))$
$C \to G(C)$

• The first rule is non-left-linear.

• Note that $C$ has the two normal forms $A$ and $G(A)$ (among others), and that
$R$ is locally confluent [10] (but not Church-Rosser).

• The concurrent relation $R^{\parallel}$ associated to $R$ is not terminating. For example

$F(C, G(C)) \to_{R^{\parallel}} F(G(C), F(G(C), G(G(C)))) \to_{R^{\parallel}} \cdots$